human-level performance
Matching domain experts by training from scratch on domain knowledge
Luo, Xiaoliang, Sun, Guangzhi, Love, Bradley C.
Recently, large language models (LLMs) have outperformed human experts in predicting the results of neuroscience experiments (Luo et al., 2024). What is the basis for this performance? One possibility is that statistical patterns in that specific scientific literature, as opposed to emergent reasoning abilities arising from broader training, underlie LLMs' performance. To evaluate this possibility, we trained a relatively small 124M-parameter GPT-2 model via next-word prediction on 1.3 billion tokens of domain-specific knowledge. Despite being orders of magnitude smaller than larger LLMs trained on trillions of tokens, these small models achieved expert-level performance in predicting neuroscience results. Small models trained on the neuroscience literature succeeded whether they were trained from scratch, using a tokenizer trained specifically on neuroscience text, or whether the neuroscience literature was used to finetune a pretrained GPT-2. Our results indicate that expert-level performance may be attained by even small LLMs through domain-specific, auto-regressive training approaches.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > United Kingdom > England > Greater London > London (0.04)
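The domain-specific, next-word-prediction training described in the abstract above can be illustrated in miniature. In the sketch below, a toy bigram counter stands in for GPT-2 and an invented neuroscience-flavored snippet stands in for the 1.3-billion-token corpus; it shows only the shape of the autoregressive objective, not the authors' actual training recipe.

```python
from collections import Counter, defaultdict

# Toy illustration of the next-word-prediction (autoregressive) objective.
# The "corpus" is invented neuroscience-flavored text, not the real data.
corpus = (
    "hippocampal lesions impair spatial memory . "
    "dopamine neurons encode reward prediction error . "
    "hippocampal replay supports memory consolidation ."
).split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def next_word_prob(prev, nxt):
    # P(next | prev) estimated from the domain text; 0.0 for unseen pairs.
    total = sum(bigrams[prev].values())
    return bigrams[prev][nxt] / total if total else 0.0

# After "training", in-domain continuations are more probable than others.
print(next_word_prob("hippocampal", "lesions"))  # 0.5
print(next_word_prob("hippocampal", "reward"))   # 0.0
```

A real run would replace the bigram table with a transformer trained to minimize the same conditional next-token loss, but the supervision signal is the same.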
Image recognition accuracy: An unseen challenge confounding today's AI
MVT (minimum viewing time) is a dataset-difficulty metric that measures the minimum presentation time required for an image to be recognized. Researchers hope the metric will be used to evaluate models' performance and biological plausibility and to guide the creation of new, more difficult datasets, leading to computer vision techniques that perform better in real life. Imagine you are scrolling through the photos on your phone and you come across an image that at first you can't recognize. It looks like maybe something fuzzy on the couch; could it be a pillow or a coat? That ball of fluff is your friend's cat, Mocha.
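A minimal sketch of how an MVT-style score for one image might be computed, assuming recognition rates have been collected at several presentation durations. The durations, rates, and 50% threshold below are illustrative assumptions, not the researchers' exact protocol.

```python
# Hedged sketch: MVT as the shortest presentation time at which viewers
# reliably recognize an image. All numbers are illustrative.
def minimum_viewing_time(trials, threshold=0.5):
    """trials: {presentation_ms: fraction of viewers who recognized the image}.
    Returns the shortest duration whose recognition rate meets the threshold,
    or None if the image was never reliably recognized."""
    for ms in sorted(trials):
        if trials[ms] >= threshold:
            return ms
    return None

easy_image = {17: 0.9, 50: 0.95, 150: 1.0}   # recognized almost instantly
hard_image = {17: 0.1, 50: 0.3, 150: 0.7}    # needs a longer look

print(minimum_viewing_time(easy_image))  # 17
print(minimum_viewing_time(hard_image))  # 150
```

Under this framing, a dataset's difficulty profile is just the distribution of per-image MVTs, which is what makes the metric usable for building harder benchmarks.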
GPT-4 Technical Report
OpenAI: Achiam, Josh, Adler, Steven, Agarwal, Sandhini, Ahmad, Lama, Akkaya, Ilge, Aleman, Florencia Leoni, Almeida, Diogo, Altenschmidt, Janko, Altman, Sam, Anadkat, Shyamal, Avila, Red, Babuschkin, Igor, Balaji, Suchir, Balcom, Valerie, Baltescu, Paul, Bao, Haiming, Bavarian, Mo, Belgum, Jeff, Bello, Irwan, Berdine, Jake, Bernadett-Shapiro, Gabriel, Berner, Christopher, Bogdonoff, Lenny, Boiko, Oleg, Boyd, Madelaine, Brakman, Anna-Luisa, Brockman, Greg, Brooks, Tim, Brundage, Miles, Button, Kevin, Cai, Trevor, Campbell, Rosie, Cann, Andrew, Carey, Brittany, Carlson, Chelsea, Carmichael, Rory, Chan, Brooke, Chang, Che, Chantzis, Fotis, Chen, Derek, Chen, Sully, Chen, Ruby, Chen, Jason, Chen, Mark, Chess, Ben, Cho, Chester, Chu, Casey, Chung, Hyung Won, Cummings, Dave, Currier, Jeremiah, Dai, Yunxing, Decareaux, Cory, Degry, Thomas, Deutsch, Noah, Deville, Damien, Dhar, Arka, Dohan, David, Dowling, Steve, Dunning, Sheila, Ecoffet, Adrien, Eleti, Atty, Eloundou, Tyna, Farhi, David, Fedus, Liam, Felix, Niko, Fishman, Simón Posada, Forte, Juston, Fulford, Isabella, Gao, Leo, Georges, Elie, Gibson, Christian, Goel, Vik, Gogineni, Tarun, Goh, Gabriel, Gontijo-Lopes, Rapha, Gordon, Jonathan, Grafstein, Morgan, Gray, Scott, Greene, Ryan, Gross, Joshua, Gu, Shixiang Shane, Guo, Yufei, Hallacy, Chris, Han, Jesse, Harris, Jeff, He, Yuchen, Heaton, Mike, Heidecke, Johannes, Hesse, Chris, Hickey, Alan, Hickey, Wade, Hoeschele, Peter, Houghton, Brandon, Hsu, Kenny, Hu, Shengli, Hu, Xin, Huizinga, Joost, Jain, Shantanu, Jain, Shawn, Jang, Joanne, Jiang, Angela, Jiang, Roger, Jin, Haozhun, Jin, Denny, Jomoto, Shino, Jonn, Billie, Jun, Heewoo, Kaftan, Tomer, Kaiser, Łukasz, Kamali, Ali, Kanitscheider, Ingmar, Keskar, Nitish Shirish, Khan, Tabarak, Kilpatrick, Logan, Kim, Jong Wook, Kim, Christina, Kim, Yongjik, Kirchner, Hendrik, Kiros, Jamie, Knight, Matt, Kokotajlo, Daniel, Kondraciuk, Łukasz, Kondrich, Andrew, Konstantinidis, Aris, Kosic, Kyle, Krueger, Gretchen, Kuo, Vishal, Lampe, Michael, Lan, Ikai, Lee, Teddy, Leike, Jan, Leung, Jade, Levy, Daniel, Li, Chak Ming, Lim, Rachel, Lin, Molly, Lin, Stephanie, Litwin, Mateusz, Lopez, Theresa, Lowe, Ryan, Lue, Patricia, Makanju, Anna, Malfacini, Kim, Manning, Sam, Markov, Todor, Markovski, Yaniv, Martin, Bianca, Mayer, Katie, Mayne, Andrew, McGrew, Bob, McKinney, Scott Mayer, McLeavey, Christine, McMillan, Paul, McNeil, Jake, Medina, David, Mehta, Aalok, Menick, Jacob, Metz, Luke, Mishchenko, Andrey, Mishkin, Pamela, Monaco, Vinnie, Morikawa, Evan, Mossing, Daniel, Mu, Tong, Murati, Mira, Murk, Oleg, Mély, David, Nair, Ashvin, Nakano, Reiichiro, Nayak, Rajeev, Neelakantan, Arvind, Ngo, Richard, Noh, Hyeonwoo, Ouyang, Long, O'Keefe, Cullen, Pachocki, Jakub, Paino, Alex, Palermo, Joe, Pantuliano, Ashley, Parascandolo, Giambattista, Parish, Joel, Parparita, Emy, Passos, Alex, Pavlov, Mikhail, Peng, Andrew, Perelman, Adam, Peres, Filipe de Avila Belbute, Petrov, Michael, Pinto, Henrique Ponde de Oliveira, Pokorny, Michael, Pokrass, Michelle, Pong, Vitchyr, Powell, Tolly, Power, Alethea, Power, Boris, Proehl, Elizabeth, Puri, Raul, Radford, Alec, Rae, Jack, Ramesh, Aditya, Raymond, Cameron, Real, Francis, Rimbach, Kendra, Ross, Carl, Rotsted, Bob, Roussez, Henri, Ryder, Nick, Saltarelli, Mario, Sanders, Ted, Santurkar, Shibani, Sastry, Girish, Schmidt, Heather, Schnurr, David, Schulman, John, Selsam, Daniel, Sheppard, Kyla, Sherbakov, Toki, Shieh, Jessica, Shoker, Sarah, Shyam, Pranav, Sidor, Szymon, Sigler, Eric, Simens, Maddie, Sitkin, Jordan, Slama, Katarina, Sohl, Ian, Sokolowsky, Benjamin, Song, Yang, Staudacher, Natalie, Such, Felipe Petroski, Summers, Natalie, Sutskever, Ilya, Tang, Jie, Tezak, Nikolas, Thompson, Madeleine, Tillet, Phil, Tootoonchian, Amin, Tseng, Elizabeth, Tuggle, Preston, Turley, Nick, Tworek, Jerry, Uribe, Juan Felipe Cerón, Vallone, Andrea, Vijayvergiya, Arun, Voss, Chelsea, Wainwright, Carroll, Wang, Justin Jay, Wang, Alvin, Wang, Ben, Ward, Jonathan, Wei, Jason, Weinmann, CJ, Welihinda, Akila, Welinder, Peter, Weng, Jiayi, Weng, Lilian, Wiethoff, Matt, Willner, Dave, Winter, Clemens, Wolrich, Samuel, Wong, Hannah, Workman, Lauren, Wu, Sherwin, Wu, Jeff, Wu, Michael, Xiao, Kai, Xu, Tao, Yoo, Sarah, Yu, Kevin, Yuan, Qiming, Zaremba, Wojciech, Zellers, Rowan, Zhang, Chong, Zhang, Marvin, Zhao, Shengjia, Zheng, Tianhao, Zhuang, Juntang, Zhuk, William, Zoph, Barret
We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was developing infrastructure and optimization methods that behave predictably across a wide range of scales. This allowed us to accurately predict some aspects of GPT-4's performance based on models trained with no more than 1/1,000th the compute of GPT-4.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Spain (0.04)
- Europe > Monaco (0.04)
- (12 more...)
- Research Report (1.00)
- Personal > Interview (0.45)
- Media (1.00)
- Law > Criminal Law (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- (14 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.47)
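The predictable-scaling claim in the GPT-4 report abstract above — forecasting aspects of the final model's performance from runs with no more than 1/1,000th its compute — amounts to fitting a power law on small training runs and extrapolating. A hedged sketch of that idea, with entirely synthetic numbers:

```python
import math

# Sketch of "predictable scaling": fit loss = a * compute^(-b) on small runs
# (a straight line in log-log space), then extrapolate 1,000x further.
# The (compute, final loss) pairs below are made up for illustration.
runs = [(1e18, 3.20), (1e19, 2.60), (1e20, 2.11)]

xs = [math.log10(c) for c, _ in runs]
ys = [math.log10(l) for _, l in runs]
n = len(runs)
slope = (n * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)) / \
        (n * sum(x * x for x in xs) - sum(xs) ** 2)
intercept = (sum(ys) - slope * sum(xs)) / n

def predicted_loss(compute):
    # Evaluate the fitted power law at an arbitrary compute budget.
    return 10 ** (intercept + slope * math.log10(compute))

# Extrapolate three orders of magnitude beyond the largest fitted run.
print(round(predicted_loss(1e23), 2))
```

The report's actual methodology is more involved, but the core move — small runs pin down the scaling curve, the big run lands on it — is the one sketched here.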
OpenAI's GPT-4 exhibits "human-level performance" on professional benchmarks
On Tuesday, OpenAI announced GPT-4, a large multimodal model that can accept text and image inputs while returning text output that "exhibits human-level performance on various professional and academic benchmarks," according to OpenAI. Also on Tuesday, Microsoft announced that Bing Chat has been running on GPT-4 all along. If it performs as claimed, GPT-4 potentially represents the opening of a new era in artificial intelligence. "It passes a simulated bar exam with a score around the top 10% of test takers," writes OpenAI in its announcement. OpenAI plans to release GPT-4's text capability through ChatGPT and its commercial API, but with a waitlist at first.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)
OpenAI Reveals 'Human-Level Performance' GPT-4 That Passed Bar Exam Among Top 10%
OpenAI has revealed that GPT-4, the latest version of its primary large language model, exhibits "human-level performance" on various professional and academic tests, including passing a simulated bar exam in the top 10% of test takers. The update is a huge improvement from GPT-3.5, which scored around the bottom 10%, OpenAI said in an announcement Tuesday. GPT-4, which learns its skills by analyzing huge amounts of data culled from the internet, was designed to power artificial intelligence chatbots such as Bing's AI chat and OpenAI's ChatGPT as well as various other systems, from business software to personal online tutors. OpenAI said in a blog post that the new model is "more creative and collaborative than ever before" and "can solve difficult problems with greater accuracy, thanks to its broader general knowledge and problem-solving abilities." "The difference comes out when the complexity of the task reaches a sufficient threshold," OpenAI wrote.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)
GPT-4: OpenAI says its AI has reached 'human-level performance'
The AI behind popular chatbot ChatGPT has been updated to a new version known as GPT-4 – and many people have already been unknowingly exposed to the newest AI's supposedly improved capabilities for weeks prior to the announcement. OpenAI, the company that developed GPT-4, says it "spent 6 months making GPT-4 safer and more aligned" so that the AI is less likely to produce "disallowed content" in response to human users' queries. GPT-4 delivers "human-level performance" and outperforms its predecessor GPT-3.5 on many simulated exams …
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.79)
In reinforcement learning, slower networks can learn faster - Amazon Science
Reinforcement learning (RL) is an increasingly popular way to model sequential decision-making problems in artificial intelligence. RL agents learn through trial and error, repeatedly interacting with the world to learn a policy that maximizes a reward signal. RL agents have recently achieved remarkable results when used in conjunction with deep neural networks. Chief among these so-called deep-RL results is the 2015 paper that introduced the Deep Q Network (DQN) agent, which surpassed human-level performance on a large set of Atari games. A core component of DQN is an optimizer that adapts the parameters of the neural network to minimize the DQN objective.
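The DQN objective the article refers to is the squared temporal-difference error between the online network's value Q(s, a) and a bootstrapped target r + γ·max_a′ Q_target(s′, a′) from a periodically-synced target network. A toy sketch, with a Q-table standing in for the neural network and all numbers illustrative:

```python
import random

# Minimal sketch of the DQN objective. A tabular Q-function stands in for
# the deep network; a frozen copy plays the role of the target network.
random.seed(0)
n_states, n_actions, gamma = 4, 2, 0.99

q_online = [[random.gauss(0, 1) for _ in range(n_actions)] for _ in range(n_states)]
q_target = [row[:] for row in q_online]  # periodically-synced frozen copy

def dqn_loss(s, a, r, s_next, done):
    # Squared TD error against the bootstrapped target.
    target = r + (0.0 if done else gamma * max(q_target[s_next]))
    return (target - q_online[s][a]) ** 2

# One optimization step on a single transition: move Q(s, a) toward the target.
s, a, r, s_next = 0, 1, 1.0, 2
before = dqn_loss(s, a, r, s_next, done=False)
target = r + gamma * max(q_target[s_next])
q_online[s][a] += 0.5 * (target - q_online[s][a])
after = dqn_loss(s, a, r, s_next, done=False)
print(after < before)  # the step reduced the objective
```

In the real agent this table is a convolutional network and the step is taken by an optimizer over minibatches from a replay buffer, but the loss being minimized has this shape.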
Strategic Management of Machine Learning Projects
You can sometimes break an end-to-end model into two stages and introduce a hand-designed component in the middle that extracts features or does some processing, making the whole system much better. For instance, a system that first detects whether an image contains a person, then uses a hand-designed component to crop to the person's face before running facial recognition, can make a better face recognition system than one that is completely end-to-end.
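The face-recognition example can be sketched as a two-stage pipeline. The detector and recognizer below are stand-in stubs (hypothetical functions), since the point is the decomposition, not the models themselves:

```python
# Sketch of the two-stage design described above: a hand-designed crop step
# feeding a learned recognizer, instead of one end-to-end mapping.
def detect_and_crop_face(image):
    """Hand-designed component: return just the face region. This stub takes
    the centre crop of an image represented as a 2-D list of pixels."""
    h, w = len(image), len(image[0])
    return [row[w // 4: 3 * w // 4] for row in image[h // 4: 3 * h // 4]]

def recognize_face(face_crop):
    """Learned component (stub): identify the person from the cropped face.
    A real system would run a trained classifier here."""
    return "person_42"

def pipeline(image):
    # Splitting the task lets each stage be engineered or trained on an
    # easier, better-defined problem than the full end-to-end mapping.
    return recognize_face(detect_and_crop_face(image))

image = [[0] * 8 for _ in range(8)]
print(pipeline(image))  # person_42
```

The design trade-off is data efficiency versus flexibility: the hand-designed crop injects prior knowledge the end-to-end model would otherwise have to learn from labeled examples.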
Deep Learning For Compliance Checks: What's New? - KDnuggets
Natural Language Processing (NLP) has long played a significant role in the compliance processes of major banks around the world. By implementing NLP techniques in production processes, compliance departments can maintain detailed checks and keep up with regulator demands. All of these areas can benefit from document processing and the use of NLP techniques to get through the process more effectively. However, certain verification tasks fall beyond the reach of traditional, rules-based NLP systems, and several challenges make rules-based systems complicated to use in check routines. This is where deep learning can fill the gaps, providing smoother and more efficient compliance checks.
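The gap between rules-based and learned checks can be sketched as follows. The regex rule, the keyword set, and the token-overlap "similarity" are all invented stand-ins for real compliance rules and a real deep model:

```python
import re

# Toy contrast: an exact-pattern rule vs. a similarity-based check.
# Token overlap stands in for a learned semantic model.
RULE = re.compile(r"\bwire transfer to sanctioned entity\b", re.IGNORECASE)

def rules_based_flag(text):
    # Fires only when the exact phrase appears.
    return bool(RULE.search(text))

def learned_flag(text, threshold=0.5):
    # Fires when the text is "similar enough" to known risky language.
    risky = {"wire", "transfer", "sanctioned", "entity", "payment", "blocked"}
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return len(tokens & risky) / max(len(tokens), 1) >= threshold

exact = "Wire transfer to sanctioned entity approved."
paraphrase = "Payment wired to a blocked entity."
print(rules_based_flag(exact), rules_based_flag(paraphrase))  # True False
print(learned_flag(exact), learned_flag(paraphrase))
```

A rules-based system misses the paraphrase entirely, which is the kind of gap the article says deep learning models close by matching meaning rather than exact surface patterns.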